CPU times: user 43.2 s, sys: 17.9 s, total: 1min 1s
Wall time: 20.8 s
level0
topic_0: ['data', 'sequencing', 'analysis', 'cells', 'cell', 'dna', 'cancer', 'methods', 'used', 'gene']
topic_1: ['reads', 'genome', 'read', 'data', 'alignment', 'reference', 'variant', 'sequencing', 'genomes', 'coverage']
level1
topic_0: ['sequencing', 'dna', 'cancer', 'resistance', 'gene', 'ngs', 'genes', 'using', 'detection', 'protein']
topic_1: ['variant', 'kraken', 'variants', 'regions', 'normalization', 'benchmark', 'species', 'scone', 'snps', 'wgs']
topic_2: ['reads', 'genome', 'read', 'alignment', 'assembly', 'coverage', 'genomes', 'reference', 'contigs', 'graph']
topic_3: ['data', 'analysis', 'cell', 'cells', 'methods', 'metagenomic', 'used', 'expression', 'nat', 'metagenomics']
level2
topic_0: ['alignment', 'bioinformatics', 'tools', 'algorithms', 'umap', 'mapping', 'short', 'length', 'algorithm', 'fuzzy']
topic_1: ['genomes', 'contigs', 'lineage', 'contig', 'tree', 'assemblies', 'supplementary', 'assigned', 'samples', 'grapetree']
topic_2: ['cell', 'cells', 'methods', 'expression', 'clustering', 'number', 'dataset', 'model', 'clusters', 'scvis']
topic_3: ['variant', 'variants', 'ngs', 'wgs', 'normalization', 'scone', 'calling', 'depth', 'performance', 'lrs']
topic_4: ['mash', 'sketch', 'aligner', 'fastp', 'hash', 'mappings', 'quality', 'size', 'mapq', 'file']
topic_5: ['species', 'learning', 'tumor', 'detection', 'deep', 'genomics', 'liquid', 'circulating', 'pubmed', 'patients']
topic_6: ['metagenomic', 'metagenomics', 'microbiome', 'args', 'benchmark', 'regions', 'resistance', 'microbial', 'usa', 'pubmed']
topic_7: ['assembly', 'coverage', 'graph', 'set', 'illumina', 'distance', 'bias', 'forensic', 'ajb', 'ion']
level3
topic_0: ['variant', 'read', 'genome', 'aligner', 'resfinder', 'mappings', 'resistance', 'mapq', 'reference', 'reads']
topic_1: ['genomes', 'assembly', 'reads', 'using', 'genome', 'mash', 'benchmark', 'regions', 'variants', 'coverage']
topic_2: ['data', 'clustering', 'cell', 'cells', 'normalization', 'genes', 'methods', 'expression', 'number', 'used']
topic_3: ['data', 'coverage', 'learning', 'genome', 'deep', 'bias', 'sequencing', 'genomics', 'illumina', 'human']
topic_4: ['read', 'alignment', 'reads', 'genome', 'reference', 'sequencing', 'algorithms', 'bioinformatics', 'dna', 'scone']
topic_5: ['sequencing', 'species', 'data', 'wgs', 'reads', 'lrs', 'variant', 'depth', 'using', 'kraken']
topic_6: ['umap', 'data', 'usa', 'university', 'fuzzy', 'author', 'set', 'qiime', 'manuscript', 'manifold']
topic_7: ['analysis', 'data', 'cell', 'cells', 'methods', 'sequencing', 'gatk', 'expression', 'gene', 'genome']
topic_8: ['snps', 'drosophila', 'variant', 'amino', 'megares', 'gene', 'snpeff', 'variants', 'args', 'protein']
topic_9: ['cells', 'data', 'args', 'resistance', 'scvis', 'resistome', 'cell', 'dataset', 'bipolar', 'clusters']
topic_10: ['cancer', 'dna', 'data', 'sequencing', 'analysis', 'tumor', 'detection', 'pubmed', 'liquid', 'circulating']
topic_11: ['reads', 'graph', 'assembly', 'ajb', 'contigs', 'bruijn', 'distance', 'spades', 'genome', 'edge']
topic_12: ['metagenomic', 'analysis', 'data', 'sequencing', 'metagenomics', 'microbiome', 'used', 'microbial', 'dna', 'pubmed']
topic_13: ['data', 'nat', 'methods', 'integration', 'analysis', 'cell', 'reduction', 'sequencing', 'cells', 'joint']
topic_14: ['sequencing', 'dna', 'ngs', 'cancer', 'analysis', 'data', 'genome', 'forensic', 'technology', 'variant']
topic_15: ['kraken', 'reference', 'data', 'sequences', 'genes', 'used', 'lineage', 'sequence', 'genomes', 'genome']
Results for level 0 Sparsity Phi: 0.381 Sparsity Theta: 0.000 Kernel contrast: 0.891 Kernel purity: 0.938
Results for level 1 Sparsity Phi: 0.567 Sparsity Theta: 0.000 Kernel contrast: 0.846 Kernel purity: 0.890
Results for level 2 Sparsity Phi: 0.735 Sparsity Theta: 0.009 Kernel contrast: 0.839 Kernel purity: 0.879
Results for level 3 Sparsity Phi: 0.000 Sparsity Theta: 0.000 Kernel contrast: 0.461 Kernel purity: 0.388
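
For readers unfamiliar with the four scores printed per level, here is a minimal NumPy sketch of one common way to define them. It is an illustration under stated assumptions, not the code that produced the numbers above: the function names `sparsity` and `kernel_scores`, the `eps` and `threshold` defaults, and the uniform topic prior are all hypothetical choices. Sparsity is taken as the fraction of (near-)zero entries in a matrix. A topic's kernel is taken as the words with p(t|w) above a threshold, purity as the kernel's total p(w|t), and contrast as the kernel's mean p(t|w).

```python
import numpy as np


def sparsity(matrix, eps=1e-12):
    """Fraction of entries whose magnitude is at most eps (eps is illustrative)."""
    m = np.asarray(matrix, dtype=float)
    return float(np.mean(np.abs(m) <= eps))


def kernel_scores(phi, topic_prior=None, threshold=0.25):
    """Return (mean kernel purity, mean kernel contrast) over topics.

    phi: array of shape (n_words, n_topics); column t holds p(w|t).
    topic_prior: p(t); assumed uniform when omitted.
    threshold: word w is in the kernel of topic t when p(t|w) > threshold
               (the default here is an illustrative value).
    """
    phi = np.asarray(phi, dtype=float)
    n_words, n_topics = phi.shape
    if topic_prior is None:
        topic_prior = np.full(n_topics, 1.0 / n_topics)

    joint = phi * topic_prior                     # p(w, t)
    word_mass = joint.sum(axis=1, keepdims=True)  # p(w)
    # Bayes' rule; words with zero total mass get p(t|w) = 0.
    p_t_given_w = np.divide(
        joint, word_mass, out=np.zeros_like(joint), where=word_mass > 0
    )

    kernel = p_t_given_w > threshold
    purity = (phi * kernel).sum(axis=0)           # sum of p(w|t) over the kernel
    sizes = kernel.sum(axis=0)
    contrast = np.divide(
        (p_t_given_w * kernel).sum(axis=0),
        sizes,
        out=np.zeros(n_topics),
        where=sizes > 0,
    )                                             # mean p(t|w) over the kernel
    return float(purity.mean()), float(contrast.mean())


# Toy example: two words, two topics, each word belongs to exactly one topic.
phi_demo = np.eye(2)
demo_sparsity = sparsity(phi_demo)
demo_purity, demo_contrast = kernel_scores(phi_demo)
```

Under these definitions, a level whose Phi sparsity is 0.000 has no exactly-zero word probabilities, so the kernels of its topics overlap more, which tends to lower purity and contrast.
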
| | doc_names | cluster_id | max_p_topic_id |
|---|---|---|---|
| 2 | Babaiha_et_al._-_2023_-_A_natural_language_processing_system_for_the_effic | 0 | 1 |
| 6 | D'Ercole_et_al._-_2022_-_Classifying_news_articles_in_multiple_languages_l | 0 | 1 |
| 8 | Detroja_et_al._-_2023_-_A_survey_on_Relation_Extraction | 0 | 1 |
| 11 | Doan_and_Gulla_-_2022_-_A_Survey_on_Political_Viewpoints_Identification | 0 | 1 |
| 12 | Fu_et_al._-_2020_-_Clinical_concept_extraction_A_methodology_review | 0 | 1 |
| 16 | Giordano_et_al._-_2023_-_Unveiling_the_inventive_process_from_patents_by_ex | 0 | 1 |
| 18 | Haneczok_and_Piskorski_-_2020_-_Shallow_and_deep_learning_for_event_relatedness_cl | 0 | 1 |
| 19 | Harnoune_et_al._-_2021_-_BERT_based_clinical_knowledge_extraction_for_biome | 0 | 1 |
| 25 | Lathabai_et_al._-_2022_-_Institutional_collaboration_recommendation_An_exp | 0 | 1 |
| 26 | Li_et_al._-_2021_-_Can_social_media_data_be_used_to_evaluate_the_risk | 0 | 1 |
| 27 | Li_et_al._-_2022_-_Neural_Natural_Language_Processing_for_unstructure | 0 | 1 |
| 28 | Lupi_et_al._-_2023_-_Automatic_definition_of_engineer_archetypes_A_tex | 0 | 1 |
| 30 | López-Úbeda_et_al._-_2022_-_Natural_Language_Processing_in_Pathology_Current_ | 0 | 1 |
| 31 | Mao_et_al._-_2024_-_A_survey_on_semantic_processing_techniques | 0 | 1 |
| 32 | Marchesin_et_al._-_2022_-_Empowering_digital_pathology_applications_through_ | 0 | 1 |
| 33 | May_et_al._-_2022_-_Applying_Natural_Language_Processing_in_Manufactur | 0 | 1 |
| 34 | Medić_and_Šnajder_-_2022_-_An_empirical_study_of_the_design_choices_for_local | 0 | 1 |
| 35 | Oral_et_al._-_2020_-_Information_Extraction_from_Text_Intensive_and_Vis | 0 | 1 |
| 36 | Othman_et_al._-_2019_-_Enhancing_Question_Retrieval_in_Community_Question | 0 | 1 |
| 39 | Pérez-Pérez_et_al._-_2023_-_A_novel_gluten_knowledge_base_of_potential_biomedi | 0 | 1 |
| 41 | Ruijie_et_al._-_2021_-_Patent_text_modeling_strategy_and_its_classificati | 0 | 1 |
| 43 | Timmerman_and_Bronselaer_-_2022_-_Automated_monitoring_of_online_news_accuracy_with_ | 0 | 1 |
| 44 | Wang_et_al._-_2021_-_Knowledge_graph_quality_control_A_survey | 0 | 1 |
| 46 | Zangari_et_al._-_2023_-_Ticket_automation_An_insight_into_current_researc | 0 | 1 |
| 48 | Zhao_et_al._-_2023_-_Weak-PMLC_A_large-scale_framework_for_multi-label | 0 | 1 |
| 0 | Accuosto_and_Saggion_-_2020_-_Mining_arguments_in_scientific_abstracts_with_disc | 1 | 0 |
| 1 | Amara_et_al._-_2021_-_Network_representation_learning_systematic_review | 1 | 0 |
| 3 | Baek_et_al._-_2021_-_A_critical_review_of_text-based_research_in_constr | 1 | 0 |
| 4 | Bondielli_and_Marcelloni_-_2021_-_On_the_use_of_summarization_and_transformer_archit | 1 | 0 |
| 5 | Curiskis_et_al._-_2020_-_An_evaluation_of_document_clustering_and_topic_mod | 1 | 0 |
| 7 | De_Clercq_et_al._-_2019_-_Multi-label_classification_and_interactive_NLP-bas | 1 | 0 |
| 9 | Dhayne_et_al._-_2021_-_EMR2vec_Bridging_the_gap_between_patient_data_and | 1 | 0 |
| 10 | Di_Girolamo_et_al._-_2021_-_Evolutionary_game_theoretical_on-line_event_detect | 1 | 0 |
| 13 | Fuenteslópez_et_al._-_2023_-_Biomaterials_text_mining_A_hands-on_comparative_s | 1 | 0 |
| 14 | García_del_Valle_et_al._-_2019_-_Disease_networks_and_their_contribution_to_disease | 1 | 0 |
| 15 | García-Díaz_et_al._-_2020_-_Ontology-driven_aspect-based_sentiment_analysis_cl | 1 | 0 |
| 17 | Gutman_Music_et_al._-_2022_-_Mapping_dreams_in_a_computational_space_A_phrase- | 1 | 0 |
| 20 | Ilievski_et_al._-_2020_-_The_role_of_knowledge_in_determining_identity_of_l | 1 | 0 |
| 21 | Jain_et_al._-_2021_-_Summarization_of_legal_documents_Where_are_we_now | 1 | 0 |
| 22 | Jain_et_al._-_2023_-_Bayesian_Optimization_based_Score_Fusion_of_Lingui | 1 | 0 |
| 23 | Jáñez-Martino_et_al._-_2023_-_Classifying_spam_emails_using_agglomerative_hierar | 1 | 0 |
| 24 | Kumar_and_III_-_2011_-_A_Co-training_Approach_for_Multi-view_Spectral_Clu | 1 | 0 |
| 29 | Lytos_et_al._-_2019_-_The_evolution_of_argumentation_mining_From_models | 1 | 0 |
| 37 | Paolanti_and_Frontoni_-_2020_-_Multidisciplinary_Pattern_Recognition_applications | 1 | 0 |
| 38 | Pisaneschi_et_al._-_2023_-_Automatic_generation_of_scientific_papers_for_data | 1 | 0 |
| 40 | Rao_et_al._-_2021_-_A_review_on_social_spam_detection_Challenges,_ope | 1 | 0 |
| 42 | Strąk_and_Tuszyński_-_2020_-_Quantitative_analysis_of_a_private_tax_rulings_cor | 1 | 0 |
| 45 | Wang_et_al._-_2022_-_Deep_learning_modeling_of_public’s_sentiments_towa | 1 | 0 |
| 47 | Zhao_et_al._-_2021_-_Entropy-aware_self-training_for_graph_convolutiona | 1 | 0 |
| 49 | Zulkarnain_and_Putri_-_2021_-_Intelligent_transportation_systems_(ITS)_A_system | 1 | 0 |
| | doc_names | cluster_id | max_p_topic_id |
|---|---|---|---|
| 44 | Wang_et_al._-_2021_-_Knowledge_graph_quality_control_A_survey | 0 | 2 |
| 0 | Accuosto_and_Saggion_-_2020_-_Mining_arguments_in_scientific_abstracts_with_disc | 0 | 3 |
| 3 | Baek_et_al._-_2021_-_A_critical_review_of_text-based_research_in_constr | 0 | 3 |
| 8 | Detroja_et_al._-_2023_-_A_survey_on_Relation_Extraction | 0 | 3 |
| 15 | García-Díaz_et_al._-_2020_-_Ontology-driven_aspect-based_sentiment_analysis_cl | 0 | 3 |
| 17 | Gutman_Music_et_al._-_2022_-_Mapping_dreams_in_a_computational_space_A_phrase- | 0 | 3 |
| 18 | Haneczok_and_Piskorski_-_2020_-_Shallow_and_deep_learning_for_event_relatedness_cl | 0 | 3 |
| 31 | Mao_et_al._-_2024_-_A_survey_on_semantic_processing_techniques | 0 | 3 |
| 1 | Amara_et_al._-_2021_-_Network_representation_learning_systematic_review | 1 | 0 |
| 2 | Babaiha_et_al._-_2023_-_A_natural_language_processing_system_for_the_effic | 1 | 0 |
| 4 | Bondielli_and_Marcelloni_-_2021_-_On_the_use_of_summarization_and_transformer_archit | 1 | 0 |
| 5 | Curiskis_et_al._-_2020_-_An_evaluation_of_document_clustering_and_topic_mod | 1 | 0 |
| 6 | D'Ercole_et_al._-_2022_-_Classifying_news_articles_in_multiple_languages_l | 1 | 0 |
| 10 | Di_Girolamo_et_al._-_2021_-_Evolutionary_game_theoretical_on-line_event_detect | 1 | 0 |
| 19 | Harnoune_et_al._-_2021_-_BERT_based_clinical_knowledge_extraction_for_biome | 1 | 0 |
| 20 | Ilievski_et_al._-_2020_-_The_role_of_knowledge_in_determining_identity_of_l | 1 | 0 |
| 21 | Jain_et_al._-_2021_-_Summarization_of_legal_documents_Where_are_we_now | 1 | 0 |
| 22 | Jain_et_al._-_2023_-_Bayesian_Optimization_based_Score_Fusion_of_Lingui | 1 | 0 |
| 24 | Kumar_and_III_-_2011_-_A_Co-training_Approach_for_Multi-view_Spectral_Clu | 1 | 0 |
| 28 | Lupi_et_al._-_2023_-_Automatic_definition_of_engineer_archetypes_A_tex | 1 | 0 |
| 33 | May_et_al._-_2022_-_Applying_Natural_Language_Processing_in_Manufactur | 1 | 0 |
| 35 | Oral_et_al._-_2020_-_Information_Extraction_from_Text_Intensive_and_Vis | 1 | 0 |
| 38 | Pisaneschi_et_al._-_2023_-_Automatic_generation_of_scientific_papers_for_data | 1 | 0 |
| 43 | Timmerman_and_Bronselaer_-_2022_-_Automated_monitoring_of_online_news_accuracy_with_ | 1 | 0 |
| 47 | Zhao_et_al._-_2021_-_Entropy-aware_self-training_for_graph_convolutiona | 1 | 0 |
| 46 | Zangari_et_al._-_2023_-_Ticket_automation_An_insight_into_current_researc | 1 | 2 |
| 13 | Fuenteslópez_et_al._-_2023_-_Biomaterials_text_mining_A_hands-on_comparative_s | 2 | 2 |
| 14 | García_del_Valle_et_al._-_2019_-_Disease_networks_and_their_contribution_to_disease | 2 | 2 |
| 25 | Lathabai_et_al._-_2022_-_Institutional_collaboration_recommendation_An_exp | 2 | 2 |
| 34 | Medić_and_Šnajder_-_2022_-_An_empirical_study_of_the_design_choices_for_local | 2 | 2 |
| 36 | Othman_et_al._-_2019_-_Enhancing_Question_Retrieval_in_Community_Question | 2 | 2 |
| 39 | Pérez-Pérez_et_al._-_2023_-_A_novel_gluten_knowledge_base_of_potential_biomedi | 2 | 2 |
| 42 | Strąk_and_Tuszyński_-_2020_-_Quantitative_analysis_of_a_private_tax_rulings_cor | 2 | 2 |
| 49 | Zulkarnain_and_Putri_-_2021_-_Intelligent_transportation_systems_(ITS)_A_system | 2 | 2 |
| 7 | De_Clercq_et_al._-_2019_-_Multi-label_classification_and_interactive_NLP-bas | 3 | 1 |
| 9 | Dhayne_et_al._-_2021_-_EMR2vec_Bridging_the_gap_between_patient_data_and | 3 | 1 |
| 11 | Doan_and_Gulla_-_2022_-_A_Survey_on_Political_Viewpoints_Identification | 3 | 1 |
| 12 | Fu_et_al._-_2020_-_Clinical_concept_extraction_A_methodology_review | 3 | 1 |
| 16 | Giordano_et_al._-_2023_-_Unveiling_the_inventive_process_from_patents_by_ex | 3 | 1 |
| 23 | Jáñez-Martino_et_al._-_2023_-_Classifying_spam_emails_using_agglomerative_hierar | 3 | 1 |
| 26 | Li_et_al._-_2021_-_Can_social_media_data_be_used_to_evaluate_the_risk | 3 | 1 |
| 27 | Li_et_al._-_2022_-_Neural_Natural_Language_Processing_for_unstructure | 3 | 1 |
| 29 | Lytos_et_al._-_2019_-_The_evolution_of_argumentation_mining_From_models | 3 | 1 |
| 30 | López-Úbeda_et_al._-_2022_-_Natural_Language_Processing_in_Pathology_Current_ | 3 | 1 |
| 32 | Marchesin_et_al._-_2022_-_Empowering_digital_pathology_applications_through_ | 3 | 1 |
| 37 | Paolanti_and_Frontoni_-_2020_-_Multidisciplinary_Pattern_Recognition_applications | 3 | 1 |
| 40 | Rao_et_al._-_2021_-_A_review_on_social_spam_detection_Challenges,_ope | 3 | 1 |
| 41 | Ruijie_et_al._-_2021_-_Patent_text_modeling_strategy_and_its_classificati | 3 | 1 |
| 45 | Wang_et_al._-_2022_-_Deep_learning_modeling_of_public’s_sentiments_towa | 3 | 1 |
| 48 | Zhao_et_al._-_2023_-_Weak-PMLC_A_large-scale_framework_for_multi-label | 3 | 1 |
| | doc_names | cluster_id | max_p_topic_id |
|---|---|---|---|
| 3 | Baek_et_al._-_2021_-_A_critical_review_of_text-based_research_in_constr | 0 | 1 |
| 11 | Doan_and_Gulla_-_2022_-_A_Survey_on_Political_Viewpoints_Identification | 0 | 1 |
| 12 | Fu_et_al._-_2020_-_Clinical_concept_extraction_A_methodology_review | 0 | 1 |
| 27 | Li_et_al._-_2022_-_Neural_Natural_Language_Processing_for_unstructure | 0 | 1 |
| 35 | Oral_et_al._-_2020_-_Information_Extraction_from_Text_Intensive_and_Vis | 0 | 1 |
| 46 | Zangari_et_al._-_2023_-_Ticket_automation_An_insight_into_current_researc | 0 | 1 |
| 48 | Zhao_et_al._-_2023_-_Weak-PMLC_A_large-scale_framework_for_multi-label | 0 | 1 |
| 8 | Detroja_et_al._-_2023_-_A_survey_on_Relation_Extraction | 1 | 4 |
| 31 | Mao_et_al._-_2024_-_A_survey_on_semantic_processing_techniques | 1 | 4 |
| 43 | Timmerman_and_Bronselaer_-_2022_-_Automated_monitoring_of_online_news_accuracy_with_ | 1 | 4 |
| 44 | Wang_et_al._-_2021_-_Knowledge_graph_quality_control_A_survey | 1 | 4 |
| 21 | Jain_et_al._-_2021_-_Summarization_of_legal_documents_Where_are_we_now | 2 | 1 |
| 18 | Haneczok_and_Piskorski_-_2020_-_Shallow_and_deep_learning_for_event_relatedness_cl | 2 | 2 |
| 22 | Jain_et_al._-_2023_-_Bayesian_Optimization_based_Score_Fusion_of_Lingui | 2 | 2 |
| 28 | Lupi_et_al._-_2023_-_Automatic_definition_of_engineer_archetypes_A_tex | 2 | 2 |
| 29 | Lytos_et_al._-_2019_-_The_evolution_of_argumentation_mining_From_models | 2 | 2 |
| 30 | López-Úbeda_et_al._-_2022_-_Natural_Language_Processing_in_Pathology_Current_ | 2 | 2 |
| 32 | Marchesin_et_al._-_2022_-_Empowering_digital_pathology_applications_through_ | 2 | 2 |
| 33 | May_et_al._-_2022_-_Applying_Natural_Language_Processing_in_Manufactur | 2 | 2 |
| 38 | Pisaneschi_et_al._-_2023_-_Automatic_generation_of_scientific_papers_for_data | 2 | 2 |
| 6 | D'Ercole_et_al._-_2022_-_Classifying_news_articles_in_multiple_languages_l | 2 | 7 |
| 0 | Accuosto_and_Saggion_-_2020_-_Mining_arguments_in_scientific_abstracts_with_disc | 3 | 5 |
| 10 | Di_Girolamo_et_al._-_2021_-_Evolutionary_game_theoretical_on-line_event_detect | 3 | 5 |
| 15 | García-Díaz_et_al._-_2020_-_Ontology-driven_aspect-based_sentiment_analysis_cl | 3 | 5 |
| 23 | Jáñez-Martino_et_al._-_2023_-_Classifying_spam_emails_using_agglomerative_hierar | 3 | 5 |
| 37 | Paolanti_and_Frontoni_-_2020_-_Multidisciplinary_Pattern_Recognition_applications | 3 | 5 |
| 40 | Rao_et_al._-_2021_-_A_review_on_social_spam_detection_Challenges,_ope | 3 | 5 |
| 42 | Strąk_and_Tuszyński_-_2020_-_Quantitative_analysis_of_a_private_tax_rulings_cor | 3 | 5 |
| 45 | Wang_et_al._-_2022_-_Deep_learning_modeling_of_public’s_sentiments_towa | 3 | 5 |
| 5 | Curiskis_et_al._-_2020_-_An_evaluation_of_document_clustering_and_topic_mod | 4 | 0 |
| 7 | De_Clercq_et_al._-_2019_-_Multi-label_classification_and_interactive_NLP-bas | 4 | 0 |
| 13 | Fuenteslópez_et_al._-_2023_-_Biomaterials_text_mining_A_hands-on_comparative_s | 4 | 0 |
| 24 | Kumar_and_III_-_2011_-_A_Co-training_Approach_for_Multi-view_Spectral_Clu | 4 | 0 |
| 25 | Lathabai_et_al._-_2022_-_Institutional_collaboration_recommendation_An_exp | 4 | 0 |
| 39 | Pérez-Pérez_et_al._-_2023_-_A_novel_gluten_knowledge_base_of_potential_biomedi | 4 | 0 |
| 1 | Amara_et_al._-_2021_-_Network_representation_learning_systematic_review | 5 | 7 |
| 2 | Babaiha_et_al._-_2023_-_A_natural_language_processing_system_for_the_effic | 5 | 7 |
| 14 | García_del_Valle_et_al._-_2019_-_Disease_networks_and_their_contribution_to_disease | 5 | 7 |
| 19 | Harnoune_et_al._-_2021_-_BERT_based_clinical_knowledge_extraction_for_biome | 5 | 7 |
| 20 | Ilievski_et_al._-_2020_-_The_role_of_knowledge_in_determining_identity_of_l | 5 | 7 |
| 47 | Zhao_et_al._-_2021_-_Entropy-aware_self-training_for_graph_convolutiona | 5 | 7 |
| 9 | Dhayne_et_al._-_2021_-_EMR2vec_Bridging_the_gap_between_patient_data_and | 6 | 3 |
| 17 | Gutman_Music_et_al._-_2022_-_Mapping_dreams_in_a_computational_space_A_phrase- | 6 | 3 |
| 34 | Medić_and_Šnajder_-_2022_-_An_empirical_study_of_the_design_choices_for_local | 6 | 3 |
| 49 | Zulkarnain_and_Putri_-_2021_-_Intelligent_transportation_systems_(ITS)_A_system | 6 | 3 |
| 4 | Bondielli_and_Marcelloni_-_2021_-_On_the_use_of_summarization_and_transformer_archit | 7 | 6 |
| 16 | Giordano_et_al._-_2023_-_Unveiling_the_inventive_process_from_patents_by_ex | 7 | 6 |
| 26 | Li_et_al._-_2021_-_Can_social_media_data_be_used_to_evaluate_the_risk | 7 | 6 |
| 36 | Othman_et_al._-_2019_-_Enhancing_Question_Retrieval_in_Community_Question | 7 | 6 |
| 41 | Ruijie_et_al._-_2021_-_Patent_text_modeling_strategy_and_its_classificati | 7 | 6 |
| | doc_names | cluster_id | max_p_topic_id |
|---|---|---|---|
| 7 | De_Clercq_et_al._-_2019_-_Multi-label_classification_and_interactive_NLP-bas | 0 | 7 |
| 21 | Jain_et_al._-_2021_-_Summarization_of_legal_documents_Where_are_we_now | 0 | 7 |
| 22 | Jain_et_al._-_2023_-_Bayesian_Optimization_based_Score_Fusion_of_Lingui | 0 | 7 |
| 36 | Othman_et_al._-_2019_-_Enhancing_Question_Retrieval_in_Community_Question | 0 | 7 |
| 38 | Pisaneschi_et_al._-_2023_-_Automatic_generation_of_scientific_papers_for_data | 0 | 7 |
| 25 | Lathabai_et_al._-_2022_-_Institutional_collaboration_recommendation_An_exp | 1 | 0 |
| 11 | Doan_and_Gulla_-_2022_-_A_Survey_on_Political_Viewpoints_Identification | 2 | 2 |
| 48 | Zhao_et_al._-_2023_-_Weak-PMLC_A_large-scale_framework_for_multi-label | 2 | 2 |
| 4 | Bondielli_and_Marcelloni_-_2021_-_On_the_use_of_summarization_and_transformer_archit | 3 | 5 |
| 5 | Curiskis_et_al._-_2020_-_An_evaluation_of_document_clustering_and_topic_mod | 3 | 5 |
| 16 | Giordano_et_al._-_2023_-_Unveiling_the_inventive_process_from_patents_by_ex | 3 | 5 |
| 43 | Timmerman_and_Bronselaer_-_2022_-_Automated_monitoring_of_online_news_accuracy_with_ | 3 | 5 |
| 17 | Gutman_Music_et_al._-_2022_-_Mapping_dreams_in_a_computational_space_A_phrase- | 4 | 1 |
| 9 | Dhayne_et_al._-_2021_-_EMR2vec_Bridging_the_gap_between_patient_data_and | 5 | 6 |
| 20 | Ilievski_et_al._-_2020_-_The_role_of_knowledge_in_determining_identity_of_l | 5 | 6 |
| 24 | Kumar_and_III_-_2011_-_A_Co-training_Approach_for_Multi-view_Spectral_Clu | 5 | 6 |
| 33 | May_et_al._-_2022_-_Applying_Natural_Language_Processing_in_Manufactur | 5 | 6 |
| 14 | García_del_Valle_et_al._-_2019_-_Disease_networks_and_their_contribution_to_disease | 6 | 15 |
| 30 | López-Úbeda_et_al._-_2022_-_Natural_Language_Processing_in_Pathology_Current_ | 6 | 15 |
| 32 | Marchesin_et_al._-_2022_-_Empowering_digital_pathology_applications_through_ | 6 | 15 |
| 3 | Baek_et_al._-_2021_-_A_critical_review_of_text-based_research_in_constr | 7 | 3 |
| 15 | García-Díaz_et_al._-_2020_-_Ontology-driven_aspect-based_sentiment_analysis_cl | 7 | 3 |
| 29 | Lytos_et_al._-_2019_-_The_evolution_of_argumentation_mining_From_models | 7 | 3 |
| 45 | Wang_et_al._-_2022_-_Deep_learning_modeling_of_public’s_sentiments_towa | 7 | 3 |
| 6 | D'Ercole_et_al._-_2022_-_Classifying_news_articles_in_multiple_languages_l | 8 | 10 |
| 13 | Fuenteslópez_et_al._-_2023_-_Biomaterials_text_mining_A_hands-on_comparative_s | 8 | 10 |
| 28 | Lupi_et_al._-_2023_-_Automatic_definition_of_engineer_archetypes_A_tex | 8 | 10 |
| 34 | Medić_and_Šnajder_-_2022_-_An_empirical_study_of_the_design_choices_for_local | 8 | 10 |
| 49 | Zulkarnain_and_Putri_-_2021_-_Intelligent_transportation_systems_(ITS)_A_system | 8 | 10 |
| 0 | Accuosto_and_Saggion_-_2020_-_Mining_arguments_in_scientific_abstracts_with_disc | 9 | 9 |
| 12 | Fu_et_al._-_2020_-_Clinical_concept_extraction_A_methodology_review | 9 | 9 |
| 39 | Pérez-Pérez_et_al._-_2023_-_A_novel_gluten_knowledge_base_of_potential_biomedi | 9 | 9 |
| 8 | Detroja_et_al._-_2023_-_A_survey_on_Relation_Extraction | 10 | 4 |
| 35 | Oral_et_al._-_2020_-_Information_Extraction_from_Text_Intensive_and_Vis | 10 | 14 |
| 31 | Mao_et_al._-_2024_-_A_survey_on_semantic_processing_techniques | 11 | 8 |
| 42 | Strąk_and_Tuszyński_-_2020_-_Quantitative_analysis_of_a_private_tax_rulings_cor | 11 | 8 |
| 44 | Wang_et_al._-_2021_-_Knowledge_graph_quality_control_A_survey | 11 | 8 |
| 18 | Haneczok_and_Piskorski_-_2020_-_Shallow_and_deep_learning_for_event_relatedness_cl | 12 | 13 |
| 23 | Jáñez-Martino_et_al._-_2023_-_Classifying_spam_emails_using_agglomerative_hierar | 12 | 13 |
| 26 | Li_et_al._-_2021_-_Can_social_media_data_be_used_to_evaluate_the_risk | 12 | 13 |
| 1 | Amara_et_al._-_2021_-_Network_representation_learning_systematic_review | 13 | 12 |
| 10 | Di_Girolamo_et_al._-_2021_-_Evolutionary_game_theoretical_on-line_event_detect | 13 | 12 |
| 37 | Paolanti_and_Frontoni_-_2020_-_Multidisciplinary_Pattern_Recognition_applications | 13 | 12 |
| 47 | Zhao_et_al._-_2021_-_Entropy-aware_self-training_for_graph_convolutiona | 13 | 12 |
| 40 | Rao_et_al._-_2021_-_A_review_on_social_spam_detection_Challenges,_ope | 14 | 14 |
| 41 | Ruijie_et_al._-_2021_-_Patent_text_modeling_strategy_and_its_classificati | 14 | 14 |
| 2 | Babaiha_et_al._-_2023_-_A_natural_language_processing_system_for_the_effic | 15 | 11 |
| 19 | Harnoune_et_al._-_2021_-_BERT_based_clinical_knowledge_extraction_for_biome | 15 | 11 |
| 27 | Li_et_al._-_2022_-_Neural_Natural_Language_Processing_for_unstructure | 15 | 11 |
| 46 | Zangari_et_al._-_2023_-_Ticket_automation_An_insight_into_current_researc | 15 | 11 |
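
The tables above pair each document with a `cluster_id` and a `max_p_topic_id`. A minimal sketch of how such a table could be assembled and cross-checked is given below. It assumes a document-topic matrix `theta` of shape (n_topics, n_docs) and an externally supplied list of cluster labels. The helper name `assignment_table`, the matrix orientation, and the toy inputs are assumptions for illustration, and the clustering method behind `cluster_id` is not stated in the output.

```python
import numpy as np
import pandas as pd


def assignment_table(theta, doc_names, cluster_ids):
    """Build a per-document table with the most probable topic.

    theta: array of shape (n_topics, n_docs), column d holds p(t|d) (assumed layout).
    doc_names: sequence of document names, one per column of theta.
    cluster_ids: sequence of cluster labels, one per document.
    """
    theta = np.asarray(theta, dtype=float)
    table = pd.DataFrame(
        {
            "doc_names": list(doc_names),
            "cluster_id": list(cluster_ids),
            # Index of the topic with the highest probability for each document.
            "max_p_topic_id": theta.argmax(axis=0),
        }
    )
    # Same ordering as the tables above: by cluster, then topic, then original index.
    return table.sort_values(["cluster_id", "max_p_topic_id"], kind="stable")


# Toy inputs (hypothetical): 2 topics, 3 documents.
theta_demo = [[0.9, 0.2, 0.1],
              [0.1, 0.8, 0.9]]
table_demo = assignment_table(theta_demo, ["a", "b", "c"], [1, 0, 0])

# Contingency table: rows are clusters, columns are dominant topics.
agreement_demo = pd.crosstab(table_demo["cluster_id"], table_demo["max_p_topic_id"])
```

A contingency table with one dominant cell per row indicates that clusters and dominant topics largely coincide, as in the level-0 table, where every document in cluster 0 has topic 1 and every document in cluster 1 has topic 0.
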
CPU times: user 54.9 s, sys: 15.2 s, total: 1min 10s
Wall time: 26.3 s
level0
topic_0: ['data', 'network', 'social', 'used', 'learning', 'spam', 'information', 'embedding', 'networks', 'research']
topic_1: ['model', 'proceedings', 'conference', 'information', 'learning', 'knowledge', 'text', 'language', 'data', 'methods']
level1
topic_0: ['data', 'network', 'embedding', 'information', 'news', 'graph', 'clustering', 'learning', 'models', 'networks']
topic_1: ['clinical', 'social', 'patent', 'text', 'model', 'classification', 'data', 'spam', 'learning', 'detection']
topic_2: ['knowledge', 'model', 'articles', 'tax', 'used', 'training', 'disease', 'set', 'research', 'quality']
topic_3: ['proceedings', 'conference', 'language', 'extraction', 'computational', 'knowledge', 'association', 'linguistics', 'learning', 'methods']
level2
topic_0: ['clustering', 'areas', 'topic', 'institutions', 'recommendation', 'terms', 'thematic', 'technology', 'performance', 'recommendations']
topic_1: ['model', 'text', 'information', 'used', 'models', 'using', 'clinical', 'conference', 'classification', 'language']
topic_2: ['event', 'sket', 'reports', 'pathology', 'engineering', 'events', 'argument', 'cancer', 'topics', 'archetype']
topic_3: ['articles', 'example', 'dream', 'citation', 'training', 'article', 'prefiltering', 'traf', 'trial', 'sampling']
topic_4: ['knowledge', 'proceedings', 'extraction', 'conference', 'computational', 'language', 'methods', 'entity', 'relation', 'concept']
topic_5: ['social', 'spam', 'tax', 'detection', 'features', 'twitter', 'techniques', 'users', 'accounts', 'cases']
topic_6: ['patent', 'questions', 'question', 'problem', 'patents', 'modeling', 'class', 'study', 'problems', 'classification']
topic_7: ['network', 'networks', 'graph', 'embedding', 'nodes', 'node', 'disease', 'representation', 'gcn', 'drug']
level3
topic_0: ['areas', 'institutions', 'recommendation', 'thematic', 'recommendations', 'system', 'set', 'collaboration', 'technology', 'institution']
topic_1: ['example', 'dream', 'house', 'dreams', 'situation', 'reports', 'flying', 'situations', 'falling', 'groups']
topic_2: ['political', 'model', 'text', 'classification', 'detection', 'work', 'seed', 'label', 'data', 'policy']
topic_3: ['construction', 'research', 'data', 'text', 'argument', 'analysis', 'nlp', 'documents', 'media', 'mining']
topic_4: ['proceedings', 'conference', 'extraction', 'learning', 'computational', 'information', 'language', 'association', 'word', 'methods']
topic_5: ['data', 'news', 'clustering', 'model', 'set', 'methods', 'online', 'models', 'patent', 'problem']
topic_6: ['data', 'clinical', 'clustering', 'trial', 'emr', 'patient', 'patients', 'vector', 'medical', 'trials']
topic_7: ['patent', 'question', 'questions', 'word', 'words', 'summarization', 'based', 'model', 'information', 'data']
topic_8: ['knowledge', 'entity', 'resolution', 'subjectivity', 'methods', 'tax', 'concept', 'entities', 'semantic', 'anaphora']
topic_9: ['clinical', 'knowledge', 'concept', 'argumentative', 'literature', 'inform', 'mining', 'med', 'disease', 'learning']
topic_10: ['model', 'articles', 'data', 'topic', 'training', 'citation', 'article', 'used', 'research', 'topics']
topic_11: ['model', 'models', 'medical', 'clinical', 'bert', 'text', 'biomedical', 'classification', 'language', 'embeddings']
topic_12: ['learning', 'network', 'embedding', 'graph', 'networks', 'node', 'nodes', 'data', 'information', 'representation']
topic_13: ['event', 'social', 'lockdown', 'class', 'ratio', 'learning', 'data', 'media', 'events', 'distancing']
topic_14: ['spam', 'social', 'detection', 'features', 'patent', 'classification', 'learning', 'dataset', 'used', 'text']
topic_15: ['sket', 'reports', 'pathology', 'data', 'disease', 'cancer', 'network', 'concepts', 'networks', 'fication']
Results for level 0 Sparsity Phi: 0.373 Sparsity Theta: 0.000 Kernel contrast: 0.871 Kernel purity: 0.914
Results for level 1 Sparsity Phi: 0.570 Sparsity Theta: 0.000 Kernel contrast: 0.827 Kernel purity: 0.780
Results for level 2 Sparsity Phi: 0.686 Sparsity Theta: 0.000 Kernel contrast: 0.816 Kernel purity: 0.787
Results for level 3 Sparsity Phi: 0.000 Sparsity Theta: 0.000 Kernel contrast: 0.471 Kernel purity: 0.402
| | doc_names | cluster_id | max_p_topic_id |
|---|---|---|---|
| 2 | Babaiha_et_al._-_2023_-_A_natural_language_processing_system_for_the_effic | 0 | 1 |
| 6 | D'Ercole_et_al._-_2022_-_Classifying_news_articles_in_multiple_languages_l | 0 | 1 |
| 8 | Detroja_et_al._-_2023_-_A_survey_on_Relation_Extraction | 0 | 1 |
| 11 | Doan_and_Gulla_-_2022_-_A_Survey_on_Political_Viewpoints_Identification | 0 | 1 |
| 12 | Fu_et_al._-_2020_-_Clinical_concept_extraction_A_methodology_review | 0 | 1 |
| 16 | Giordano_et_al._-_2023_-_Unveiling_the_inventive_process_from_patents_by_ex | 0 | 1 |
| 18 | Haneczok_and_Piskorski_-_2020_-_Shallow_and_deep_learning_for_event_relatedness_cl | 0 | 1 |
| 19 | Harnoune_et_al._-_2021_-_BERT_based_clinical_knowledge_extraction_for_biome | 0 | 1 |
| 25 | Lathabai_et_al._-_2022_-_Institutional_collaboration_recommendation_An_exp | 0 | 1 |
| 26 | Li_et_al._-_2021_-_Can_social_media_data_be_used_to_evaluate_the_risk | 0 | 1 |
| 27 | Li_et_al._-_2022_-_Neural_Natural_Language_Processing_for_unstructure | 0 | 1 |
| 28 | Lupi_et_al._-_2023_-_Automatic_definition_of_engineer_archetypes_A_tex | 0 | 1 |
| 30 | López-Úbeda_et_al._-_2022_-_Natural_Language_Processing_in_Pathology_Current_ | 0 | 1 |
| 31 | Mao_et_al._-_2024_-_A_survey_on_semantic_processing_techniques | 0 | 1 |
| 32 | Marchesin_et_al._-_2022_-_Empowering_digital_pathology_applications_through_ | 0 | 1 |
| 33 | May_et_al._-_2022_-_Applying_Natural_Language_Processing_in_Manufactur | 0 | 1 |
| 34 | Medić_and_Šnajder_-_2022_-_An_empirical_study_of_the_design_choices_for_local | 0 | 1 |
| 35 | Oral_et_al._-_2020_-_Information_Extraction_from_Text_Intensive_and_Vis | 0 | 1 |
| 36 | Othman_et_al._-_2019_-_Enhancing_Question_Retrieval_in_Community_Question | 0 | 1 |
| 39 | Pérez-Pérez_et_al._-_2023_-_A_novel_gluten_knowledge_base_of_potential_biomedi | 0 | 1 |
| 41 | Ruijie_et_al._-_2021_-_Patent_text_modeling_strategy_and_its_classificati | 0 | 1 |
| 43 | Timmerman_and_Bronselaer_-_2022_-_Automated_monitoring_of_online_news_accuracy_with_ | 0 | 1 |
| 44 | Wang_et_al._-_2021_-_Knowledge_graph_quality_control_A_survey | 0 | 1 |
| 46 | Zangari_et_al._-_2023_-_Ticket_automation_An_insight_into_current_researc | 0 | 1 |
| 48 | Zhao_et_al._-_2023_-_Weak-PMLC_A_large-scale_framework_for_multi-label | 0 | 1 |
| 0 | Accuosto_and_Saggion_-_2020_-_Mining_arguments_in_scientific_abstracts_with_disc | 1 | 0 |
| 1 | Amara_et_al._-_2021_-_Network_representation_learning_systematic_review | 1 | 0 |
| 3 | Baek_et_al._-_2021_-_A_critical_review_of_text-based_research_in_constr | 1 | 0 |
| 4 | Bondielli_and_Marcelloni_-_2021_-_On_the_use_of_summarization_and_transformer_archit | 1 | 0 |
| 5 | Curiskis_et_al._-_2020_-_An_evaluation_of_document_clustering_and_topic_mod | 1 | 0 |
| 7 | De_Clercq_et_al._-_2019_-_Multi-label_classification_and_interactive_NLP-bas | 1 | 0 |
| 9 | Dhayne_et_al._-_2021_-_EMR2vec_Bridging_the_gap_between_patient_data_and | 1 | 0 |
| 10 | Di_Girolamo_et_al._-_2021_-_Evolutionary_game_theoretical_on-line_event_detect | 1 | 0 |
| 13 | Fuenteslópez_et_al._-_2023_-_Biomaterials_text_mining_A_hands-on_comparative_s | 1 | 0 |
| 14 | García_del_Valle_et_al._-_2019_-_Disease_networks_and_their_contribution_to_disease | 1 | 0 |
| 15 | García-Díaz_et_al._-_2020_-_Ontology-driven_aspect-based_sentiment_analysis_cl | 1 | 0 |
| 17 | Gutman_Music_et_al._-_2022_-_Mapping_dreams_in_a_computational_space_A_phrase- | 1 | 0 |
| 20 | Ilievski_et_al._-_2020_-_The_role_of_knowledge_in_determining_identity_of_l | 1 | 0 |
| 21 | Jain_et_al._-_2021_-_Summarization_of_legal_documents_Where_are_we_now | 1 | 0 |
| 22 | Jain_et_al._-_2023_-_Bayesian_Optimization_based_Score_Fusion_of_Lingui | 1 | 0 |
| 23 | Jáñez-Martino_et_al._-_2023_-_Classifying_spam_emails_using_agglomerative_hierar | 1 | 0 |
| 24 | Kumar_and_III_-_2011_-_A_Co-training_Approach_for_Multi-view_Spectral_Clu | 1 | 0 |
| 29 | Lytos_et_al._-_2019_-_The_evolution_of_argumentation_mining_From_models | 1 | 0 |
| 37 | Paolanti_and_Frontoni_-_2020_-_Multidisciplinary_Pattern_Recognition_applications | 1 | 0 |
| 38 | Pisaneschi_et_al._-_2023_-_Automatic_generation_of_scientific_papers_for_data | 1 | 0 |
| 40 | Rao_et_al._-_2021_-_A_review_on_social_spam_detection_Challenges,_ope | 1 | 0 |
| 42 | Strąk_and_Tuszyński_-_2020_-_Quantitative_analysis_of_a_private_tax_rulings_cor | 1 | 0 |
| 45 | Wang_et_al._-_2022_-_Deep_learning_modeling_of_public’s_sentiments_towa | 1 | 0 |
| 47 | Zhao_et_al._-_2021_-_Entropy-aware_self-training_for_graph_convolutiona | 1 | 0 |
| 49 | Zulkarnain_and_Putri_-_2021_-_Intelligent_transportation_systems_(ITS)_A_system | 1 | 0 |
| | doc_names | cluster_id | max_p_topic_id |
|---|---|---|---|
| 44 | Wang_et_al._-_2021_-_Knowledge_graph_quality_control_A_survey | 0 | 2 |
| 0 | Accuosto_and_Saggion_-_2020_-_Mining_arguments_in_scientific_abstracts_with_disc | 0 | 3 |
| 3 | Baek_et_al._-_2021_-_A_critical_review_of_text-based_research_in_constr | 0 | 3 |
| 8 | Detroja_et_al._-_2023_-_A_survey_on_Relation_Extraction | 0 | 3 |
| 15 | García-Díaz_et_al._-_2020_-_Ontology-driven_aspect-based_sentiment_analysis_cl | 0 | 3 |
| 17 | Gutman_Music_et_al._-_2022_-_Mapping_dreams_in_a_computational_space_A_phrase- | 0 | 3 |
| 18 | Haneczok_and_Piskorski_-_2020_-_Shallow_and_deep_learning_for_event_relatedness_cl | 0 | 3 |
| 31 | Mao_et_al._-_2024_-_A_survey_on_semantic_processing_techniques | 0 | 3 |
| 1 | Amara_et_al._-_2021_-_Network_representation_learning_systematic_review | 1 | 0 |
| 2 | Babaiha_et_al._-_2023_-_A_natural_language_processing_system_for_the_effic | 1 | 0 |
| 4 | Bondielli_and_Marcelloni_-_2021_-_On_the_use_of_summarization_and_transformer_archit | 1 | 0 |
| 5 | Curiskis_et_al._-_2020_-_An_evaluation_of_document_clustering_and_topic_mod | 1 | 0 |
| 6 | D'Ercole_et_al._-_2022_-_Classifying_news_articles_in_multiple_languages_l | 1 | 0 |
| 10 | Di_Girolamo_et_al._-_2021_-_Evolutionary_game_theoretical_on-line_event_detect | 1 | 0 |
| 19 | Harnoune_et_al._-_2021_-_BERT_based_clinical_knowledge_extraction_for_biome | 1 | 0 |
| 20 | Ilievski_et_al._-_2020_-_The_role_of_knowledge_in_determining_identity_of_l | 1 | 0 |
| 21 | Jain_et_al._-_2021_-_Summarization_of_legal_documents_Where_are_we_now | 1 | 0 |
| 22 | Jain_et_al._-_2023_-_Bayesian_Optimization_based_Score_Fusion_of_Lingui | 1 | 0 |
| 24 | Kumar_and_III_-_2011_-_A_Co-training_Approach_for_Multi-view_Spectral_Clu | 1 | 0 |
| 28 | Lupi_et_al._-_2023_-_Automatic_definition_of_engineer_archetypes_A_tex | 1 | 0 |
| 33 | May_et_al._-_2022_-_Applying_Natural_Language_Processing_in_Manufactur | 1 | 0 |
| 35 | Oral_et_al._-_2020_-_Information_Extraction_from_Text_Intensive_and_Vis | 1 | 0 |
| 38 | Pisaneschi_et_al._-_2023_-_Automatic_generation_of_scientific_papers_for_data | 1 | 0 |
| 43 | Timmerman_and_Bronselaer_-_2022_-_Automated_monitoring_of_online_news_accuracy_with_ | 1 | 0 |
| 47 | Zhao_et_al._-_2021_-_Entropy-aware_self-training_for_graph_convolutiona | 1 | 0 |
| 46 | Zangari_et_al._-_2023_-_Ticket_automation_An_insight_into_current_researc | 1 | 2 |
| 13 | Fuenteslópez_et_al._-_2023_-_Biomaterials_text_mining_A_hands-on_comparative_s | 2 | 2 |
| 14 | García_del_Valle_et_al._-_2019_-_Disease_networks_and_their_contribution_to_disease | 2 | 2 |
| 25 | Lathabai_et_al._-_2022_-_Institutional_collaboration_recommendation_An_exp | 2 | 2 |
| 34 | Medić_and_Šnajder_-_2022_-_An_empirical_study_of_the_design_choices_for_local | 2 | 2 |
| 36 | Othman_et_al._-_2019_-_Enhancing_Question_Retrieval_in_Community_Question | 2 | 2 |
| 39 | Pérez-Pérez_et_al._-_2023_-_A_novel_gluten_knowledge_base_of_potential_biomedi | 2 | 2 |
| 42 | Strąk_and_Tuszyński_-_2020_-_Quantitative_analysis_of_a_private_tax_rulings_cor | 2 | 2 |
| 49 | Zulkarnain_and_Putri_-_2021_-_Intelligent_transportation_systems_(ITS)_A_system | 2 | 2 |
| 7 | De_Clercq_et_al._-_2019_-_Multi-label_classification_and_interactive_NLP-bas | 3 | 1 |
| 9 | Dhayne_et_al._-_2021_-_EMR2vec_Bridging_the_gap_between_patient_data_and | 3 | 1 |
| 11 | Doan_and_Gulla_-_2022_-_A_Survey_on_Political_Viewpoints_Identification | 3 | 1 |
| 12 | Fu_et_al._-_2020_-_Clinical_concept_extraction_A_methodology_review | 3 | 1 |
| 16 | Giordano_et_al._-_2023_-_Unveiling_the_inventive_process_from_patents_by_ex | 3 | 1 |
| 23 | Jáñez-Martino_et_al._-_2023_-_Classifying_spam_emails_using_agglomerative_hierar | 3 | 1 |
| 26 | Li_et_al._-_2021_-_Can_social_media_data_be_used_to_evaluate_the_risk | 3 | 1 |
| 27 | Li_et_al._-_2022_-_Neural_Natural_Language_Processing_for_unstructure | 3 | 1 |
| 29 | Lytos_et_al._-_2019_-_The_evolution_of_argumentation_mining_From_models | 3 | 1 |
| 30 | López-Úbeda_et_al._-_2022_-_Natural_Language_Processing_in_Pathology_Current_ | 3 | 1 |
| 32 | Marchesin_et_al._-_2022_-_Empowering_digital_pathology_applications_through_ | 3 | 1 |
| 37 | Paolanti_and_Frontoni_-_2020_-_Multidisciplinary_Pattern_Recognition_applications | 3 | 1 |
| 40 | Rao_et_al._-_2021_-_A_review_on_social_spam_detection_Challenges,_ope | 3 | 1 |
| 41 | Ruijie_et_al._-_2021_-_Patent_text_modeling_strategy_and_its_classificati | 3 | 1 |
| 45 | Wang_et_al._-_2022_-_Deep_learning_modeling_of_public’s_sentiments_towa | 3 | 1 |
| 48 | Zhao_et_al._-_2023_-_Weak-PMLC_A_large-scale_framework_for_multi-label | 3 | 1 |

| | doc_names | cluster_id | max_p_topic_id |
|---|---|---|---|
| 3 | Baek_et_al._-_2021_-_A_critical_review_of_text-based_research_in_constr | 0 | 1 |
| 11 | Doan_and_Gulla_-_2022_-_A_Survey_on_Political_Viewpoints_Identification | 0 | 1 |
| 12 | Fu_et_al._-_2020_-_Clinical_concept_extraction_A_methodology_review | 0 | 1 |
| 27 | Li_et_al._-_2022_-_Neural_Natural_Language_Processing_for_unstructure | 0 | 1 |
| 35 | Oral_et_al._-_2020_-_Information_Extraction_from_Text_Intensive_and_Vis | 0 | 1 |
| 46 | Zangari_et_al._-_2023_-_Ticket_automation_An_insight_into_current_researc | 0 | 1 |
| 48 | Zhao_et_al._-_2023_-_Weak-PMLC_A_large-scale_framework_for_multi-label | 0 | 1 |
| 8 | Detroja_et_al._-_2023_-_A_survey_on_Relation_Extraction | 1 | 4 |
| 31 | Mao_et_al._-_2024_-_A_survey_on_semantic_processing_techniques | 1 | 4 |
| 43 | Timmerman_and_Bronselaer_-_2022_-_Automated_monitoring_of_online_news_accuracy_with_ | 1 | 4 |
| 44 | Wang_et_al._-_2021_-_Knowledge_graph_quality_control_A_survey | 1 | 4 |
| 21 | Jain_et_al._-_2021_-_Summarization_of_legal_documents_Where_are_we_now | 2 | 1 |
| 18 | Haneczok_and_Piskorski_-_2020_-_Shallow_and_deep_learning_for_event_relatedness_cl | 2 | 2 |
| 22 | Jain_et_al._-_2023_-_Bayesian_Optimization_based_Score_Fusion_of_Lingui | 2 | 2 |
| 28 | Lupi_et_al._-_2023_-_Automatic_definition_of_engineer_archetypes_A_tex | 2 | 2 |
| 29 | Lytos_et_al._-_2019_-_The_evolution_of_argumentation_mining_From_models | 2 | 2 |
| 30 | López-Úbeda_et_al._-_2022_-_Natural_Language_Processing_in_Pathology_Current_ | 2 | 2 |
| 32 | Marchesin_et_al._-_2022_-_Empowering_digital_pathology_applications_through_ | 2 | 2 |
| 33 | May_et_al._-_2022_-_Applying_Natural_Language_Processing_in_Manufactur | 2 | 2 |
| 38 | Pisaneschi_et_al._-_2023_-_Automatic_generation_of_scientific_papers_for_data | 2 | 2 |
| 6 | D'Ercole_et_al._-_2022_-_Classifying_news_articles_in_multiple_languages_l | 2 | 7 |
| 0 | Accuosto_and_Saggion_-_2020_-_Mining_arguments_in_scientific_abstracts_with_disc | 3 | 5 |
| 10 | Di_Girolamo_et_al._-_2021_-_Evolutionary_game_theoretical_on-line_event_detect | 3 | 5 |
| 15 | García-Díaz_et_al._-_2020_-_Ontology-driven_aspect-based_sentiment_analysis_cl | 3 | 5 |
| 23 | Jáñez-Martino_et_al._-_2023_-_Classifying_spam_emails_using_agglomerative_hierar | 3 | 5 |
| 37 | Paolanti_and_Frontoni_-_2020_-_Multidisciplinary_Pattern_Recognition_applications | 3 | 5 |
| 40 | Rao_et_al._-_2021_-_A_review_on_social_spam_detection_Challenges,_ope | 3 | 5 |
| 42 | Strąk_and_Tuszyński_-_2020_-_Quantitative_analysis_of_a_private_tax_rulings_cor | 3 | 5 |
| 45 | Wang_et_al._-_2022_-_Deep_learning_modeling_of_public’s_sentiments_towa | 3 | 5 |
| 5 | Curiskis_et_al._-_2020_-_An_evaluation_of_document_clustering_and_topic_mod | 4 | 0 |
| 7 | De_Clercq_et_al._-_2019_-_Multi-label_classification_and_interactive_NLP-bas | 4 | 0 |
| 13 | Fuenteslópez_et_al._-_2023_-_Biomaterials_text_mining_A_hands-on_comparative_s | 4 | 0 |
| 24 | Kumar_and_III_-_2011_-_A_Co-training_Approach_for_Multi-view_Spectral_Clu | 4 | 0 |
| 25 | Lathabai_et_al._-_2022_-_Institutional_collaboration_recommendation_An_exp | 4 | 0 |
| 39 | Pérez-Pérez_et_al._-_2023_-_A_novel_gluten_knowledge_base_of_potential_biomedi | 4 | 0 |
| 1 | Amara_et_al._-_2021_-_Network_representation_learning_systematic_review | 5 | 7 |
| 2 | Babaiha_et_al._-_2023_-_A_natural_language_processing_system_for_the_effic | 5 | 7 |
| 14 | García_del_Valle_et_al._-_2019_-_Disease_networks_and_their_contribution_to_disease | 5 | 7 |
| 19 | Harnoune_et_al._-_2021_-_BERT_based_clinical_knowledge_extraction_for_biome | 5 | 7 |
| 20 | Ilievski_et_al._-_2020_-_The_role_of_knowledge_in_determining_identity_of_l | 5 | 7 |
| 47 | Zhao_et_al._-_2021_-_Entropy-aware_self-training_for_graph_convolutiona | 5 | 7 |
| 9 | Dhayne_et_al._-_2021_-_EMR2vec_Bridging_the_gap_between_patient_data_and | 6 | 3 |
| 17 | Gutman_Music_et_al._-_2022_-_Mapping_dreams_in_a_computational_space_A_phrase- | 6 | 3 |
| 34 | Medić_and_Šnajder_-_2022_-_An_empirical_study_of_the_design_choices_for_local | 6 | 3 |
| 49 | Zulkarnain_and_Putri_-_2021_-_Intelligent_transportation_systems_(ITS)_A_system | 6 | 3 |
| 4 | Bondielli_and_Marcelloni_-_2021_-_On_the_use_of_summarization_and_transformer_archit | 7 | 6 |
| 16 | Giordano_et_al._-_2023_-_Unveiling_the_inventive_process_from_patents_by_ex | 7 | 6 |
| 26 | Li_et_al._-_2021_-_Can_social_media_data_be_used_to_evaluate_the_risk | 7 | 6 |
| 36 | Othman_et_al._-_2019_-_Enhancing_Question_Retrieval_in_Community_Question | 7 | 6 |
| 41 | Ruijie_et_al._-_2021_-_Patent_text_modeling_strategy_and_its_classificati | 7 | 6 |

| | doc_names | cluster_id | max_p_topic_id |
|---|---|---|---|
| 7 | De_Clercq_et_al._-_2019_-_Multi-label_classification_and_interactive_NLP-bas | 0 | 7 |
| 21 | Jain_et_al._-_2021_-_Summarization_of_legal_documents_Where_are_we_now | 0 | 7 |
| 22 | Jain_et_al._-_2023_-_Bayesian_Optimization_based_Score_Fusion_of_Lingui | 0 | 7 |
| 36 | Othman_et_al._-_2019_-_Enhancing_Question_Retrieval_in_Community_Question | 0 | 7 |
| 38 | Pisaneschi_et_al._-_2023_-_Automatic_generation_of_scientific_papers_for_data | 0 | 7 |
| 25 | Lathabai_et_al._-_2022_-_Institutional_collaboration_recommendation_An_exp | 1 | 0 |
| 11 | Doan_and_Gulla_-_2022_-_A_Survey_on_Political_Viewpoints_Identification | 2 | 2 |
| 48 | Zhao_et_al._-_2023_-_Weak-PMLC_A_large-scale_framework_for_multi-label | 2 | 2 |
| 4 | Bondielli_and_Marcelloni_-_2021_-_On_the_use_of_summarization_and_transformer_archit | 3 | 5 |
| 5 | Curiskis_et_al._-_2020_-_An_evaluation_of_document_clustering_and_topic_mod | 3 | 5 |
| 16 | Giordano_et_al._-_2023_-_Unveiling_the_inventive_process_from_patents_by_ex | 3 | 5 |
| 43 | Timmerman_and_Bronselaer_-_2022_-_Automated_monitoring_of_online_news_accuracy_with_ | 3 | 5 |
| 17 | Gutman_Music_et_al._-_2022_-_Mapping_dreams_in_a_computational_space_A_phrase- | 4 | 1 |
| 9 | Dhayne_et_al._-_2021_-_EMR2vec_Bridging_the_gap_between_patient_data_and | 5 | 6 |
| 20 | Ilievski_et_al._-_2020_-_The_role_of_knowledge_in_determining_identity_of_l | 5 | 6 |
| 24 | Kumar_and_III_-_2011_-_A_Co-training_Approach_for_Multi-view_Spectral_Clu | 5 | 6 |
| 33 | May_et_al._-_2022_-_Applying_Natural_Language_Processing_in_Manufactur | 5 | 6 |
| 14 | García_del_Valle_et_al._-_2019_-_Disease_networks_and_their_contribution_to_disease | 6 | 15 |
| 30 | López-Úbeda_et_al._-_2022_-_Natural_Language_Processing_in_Pathology_Current_ | 6 | 15 |
| 32 | Marchesin_et_al._-_2022_-_Empowering_digital_pathology_applications_through_ | 6 | 15 |
| 3 | Baek_et_al._-_2021_-_A_critical_review_of_text-based_research_in_constr | 7 | 3 |
| 15 | García-Díaz_et_al._-_2020_-_Ontology-driven_aspect-based_sentiment_analysis_cl | 7 | 3 |
| 29 | Lytos_et_al._-_2019_-_The_evolution_of_argumentation_mining_From_models | 7 | 3 |
| 45 | Wang_et_al._-_2022_-_Deep_learning_modeling_of_public’s_sentiments_towa | 7 | 3 |
| 6 | D'Ercole_et_al._-_2022_-_Classifying_news_articles_in_multiple_languages_l | 8 | 10 |
| 13 | Fuenteslópez_et_al._-_2023_-_Biomaterials_text_mining_A_hands-on_comparative_s | 8 | 10 |
| 28 | Lupi_et_al._-_2023_-_Automatic_definition_of_engineer_archetypes_A_tex | 8 | 10 |
| 34 | Medić_and_Šnajder_-_2022_-_An_empirical_study_of_the_design_choices_for_local | 8 | 10 |
| 49 | Zulkarnain_and_Putri_-_2021_-_Intelligent_transportation_systems_(ITS)_A_system | 8 | 10 |
| 0 | Accuosto_and_Saggion_-_2020_-_Mining_arguments_in_scientific_abstracts_with_disc | 9 | 9 |
| 12 | Fu_et_al._-_2020_-_Clinical_concept_extraction_A_methodology_review | 9 | 9 |
| 39 | Pérez-Pérez_et_al._-_2023_-_A_novel_gluten_knowledge_base_of_potential_biomedi | 9 | 9 |
| 8 | Detroja_et_al._-_2023_-_A_survey_on_Relation_Extraction | 10 | 4 |
| 35 | Oral_et_al._-_2020_-_Information_Extraction_from_Text_Intensive_and_Vis | 10 | 14 |
| 31 | Mao_et_al._-_2024_-_A_survey_on_semantic_processing_techniques | 11 | 8 |
| 42 | Strąk_and_Tuszyński_-_2020_-_Quantitative_analysis_of_a_private_tax_rulings_cor | 11 | 8 |
| 44 | Wang_et_al._-_2021_-_Knowledge_graph_quality_control_A_survey | 11 | 8 |
| 18 | Haneczok_and_Piskorski_-_2020_-_Shallow_and_deep_learning_for_event_relatedness_cl | 12 | 13 |
| 23 | Jáñez-Martino_et_al._-_2023_-_Classifying_spam_emails_using_agglomerative_hierar | 12 | 13 |
| 26 | Li_et_al._-_2021_-_Can_social_media_data_be_used_to_evaluate_the_risk | 12 | 13 |
| 1 | Amara_et_al._-_2021_-_Network_representation_learning_systematic_review | 13 | 12 |
| 10 | Di_Girolamo_et_al._-_2021_-_Evolutionary_game_theoretical_on-line_event_detect | 13 | 12 |
| 37 | Paolanti_and_Frontoni_-_2020_-_Multidisciplinary_Pattern_Recognition_applications | 13 | 12 |
| 47 | Zhao_et_al._-_2021_-_Entropy-aware_self-training_for_graph_convolutiona | 13 | 12 |
| 40 | Rao_et_al._-_2021_-_A_review_on_social_spam_detection_Challenges,_ope | 14 | 14 |
| 41 | Ruijie_et_al._-_2021_-_Patent_text_modeling_strategy_and_its_classificati | 14 | 14 |
| 2 | Babaiha_et_al._-_2023_-_A_natural_language_processing_system_for_the_effic | 15 | 11 |
| 19 | Harnoune_et_al._-_2021_-_BERT_based_clinical_knowledge_extraction_for_biome | 15 | 11 |
| 27 | Li_et_al._-_2022_-_Neural_Natural_Language_Processing_for_unstructure | 15 | 11 |
| 46 | Zangari_et_al._-_2023_-_Ticket_automation_An_insight_into_current_researc | 15 | 11 |
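
The tables above pair each document's `cluster_id` with its `max_p_topic_id`. One way to summarize how well the two agree is to compute, for each cluster, its most frequent topic and the share of documents carrying it. The sketch below does this for the rows of clusters 10, 11, and 14 from the last table. It assumes `max_p_topic_id` is the topic with the highest probability for the document, as the column name suggests. `cluster_purity` is a hypothetical helper written for this illustration and is not part of the original notebook.

```python
from collections import Counter, defaultdict

# (cluster_id, max_p_topic_id) pairs copied from the rows for
# clusters 10, 11 and 14 in the last table above.
rows = [
    (10, 4), (10, 14),
    (11, 8), (11, 8), (11, 8),
    (14, 14), (14, 14),
]


def cluster_purity(pairs):
    """Return {cluster_id: (dominant_topic, share)} for the given pairs.

    Hypothetical helper: `share` is the fraction of the cluster's
    documents whose max_p_topic_id equals the dominant topic.
    """
    topics_by_cluster = defaultdict(Counter)
    for cluster_id, topic_id in pairs:
        topics_by_cluster[cluster_id][topic_id] += 1

    summary = {}
    for cluster_id, counts in topics_by_cluster.items():
        # most_common(1) gives the topic with the highest count in this cluster.
        topic_id, n = counts.most_common(1)[0]
        summary[cluster_id] = (topic_id, n / sum(counts.values()))
    return summary


purity = cluster_purity(rows)
for cluster_id in sorted(purity):
    topic_id, share = purity[cluster_id]
    print(f"cluster {cluster_id}: dominant topic {topic_id}, share {share:.2f}")
```

A share of 1.0 means every document in the cluster has the same top topic (clusters 11 and 14 here), while a lower share flags a cluster that mixes topics (cluster 10 splits between topics 4 and 14).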